Quarterly Technical Report #4
Euclid Compiler Project

Report Summary


The Euclid Compiler Project is to provide a compiler for the
language Euclid, targeting the PDP-11/45 computer architecture
under the UNIX operating system. The compiler is multi-pass,
highly portable, and written in a subset of Euclid itself. The
subset is known as Small Euclid, and was chosen specifically to
permit bootstrapping.

The problems encountered in this project stem from a new, highly
sophisticated language that requires, on a relatively small
machine, a compiler with features supporting the verifiability of
the resulting code. Techniques already in use in compiler writing
had to be adapted to assist in handling the peculiarities of the
language and some previously unaddressed problems associated with
the production of verifiable code. While the problems are all
essentially solved, the time taken to create the compiler has
lengthened because of the inherent complexity of the language,
which was not previously recognized.

The language will probably have to be redefined to some extent
because of its inherent complexity. A subset, known as Middle
Euclid, requested by the KSOS team, has been defined; and the
first compiler, due now in May, will define another subset
between Small Euclid and Middle Euclid. Somewhat more effort
would permit this compiler to handle Middle Euclid in full.


The verification aspects of the language are significant, and
some of the mechanisms for handling them are in place in the
passes of the compiler where they are appropriate. The code
produced by the compiler ought to be accompanied by an augmented
source stream indicating where compiler-generated assertions have
been inserted.

Future research is indicated to obtain a stable, useful subset of
Euclid. The use of this subset, or of the one actually accepted
by the compiler, in the reworking of KSOS seems most desirable.
The verification of Euclid code also deserves future attention.

The Euclid Compiler project was established to create, for the
PDP-11/45 computer architecture, a compiler for the language
Euclid, to work under the UNIX operating system. It was designed
to be a multi-pass compiler, written in Euclid itself (actually a
subset specifically chosen to permit bootstrapping and known as
Small Euclid), and expected to be as portable to other machines
as possible.

The overall process, now nearing completion, consisted of the
following steps:

1) Define the subset language, Small Euclid.

2) Create a transliterator from that subset to the UNIX
language, C.

3) Adapt the tool, created at the University of Toronto,
known as SSL (the Syntax Semantics language), to be
available to support the Euclid subset.

4) Divide the proposed compiler into its component passes
(6 in all: the Screener/Scanner, the Parser, the
Builder, the Conformance Pass, the Allocator, and the
Coder) and design each.

5) Do the preliminary overall design work, including the
anticipation of the problems likely to be encountered
and some possible solutions. (A library of working
papers is available.)

6) Write the 6 passes. The first two, complete for some
time, were adapted from the original Small Euclid
Transliterator. The remaining four are all table
driven: the creation of the table is the task of the
SSL, and the support routines are coded in Small Euclid.

7) Test the passes.

8) Bootstrap the compiler.
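
The table-driven structure described in step 6 can be pictured
with a small sketch. The Python below is illustrative only and is
not taken from the project. The four pass names follow the list
above, but the table format, the function names
(run_table_driven_pass, run_pipeline), and the two support
routines are hypothetical stand-ins for what SSL and the Small
Euclid support code would actually provide.

```python
# Illustrative sketch only; not the project's code. The pass names come
# from the report. The table format and all routine names are assumptions.

TABLE_DRIVEN_PASSES = ["Builder", "Conformance Pass", "Allocator", "Coder"]


def run_table_driven_pass(table, support, stream):
    """Interpret `table`, a list of (operation, argument) pairs.

    Each operation name is looked up in `support`, a dict of support
    routines. A routine takes (stream, argument) and returns a new stream.
    """
    for operation, argument in table:
        stream = support[operation](stream, argument)
    return stream


# Hypothetical support routines for a toy pass.
def emit(stream, token):
    """Append one token to the output stream."""
    return stream + [token]


def drop(stream, token):
    """Remove every occurrence of a token from the stream."""
    return [t for t in stream if t != token]


SUPPORT = {"emit": emit, "drop": drop}

# A hypothetical table, standing in for one that a tool such as SSL
# would generate.
TOY_TABLE = [("drop", "comment"), ("emit", "end-of-pass")]


def run_pipeline(stream, tables):
    """Run the table for each pass in order, feeding output to the next."""
    for name in TABLE_DRIVEN_PASSES:
        stream = run_table_driven_pass(tables[name], SUPPORT, stream)
    return stream
```

In this arrangement the interpreter loop is the same for every
pass; only the table and the set of support routines differ,
which is one plausible reason a table-driven design helps
portability.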

During the execution of this set of tasks several technical
problems have arisen. Notably, the interactions among the
linguistic features of the language have been examined in great
detail and the resulting complexity recognized. As a result the
compiler will itself be much larger than anticipated and will
handle a language which is not the full Euclid as defined. The
limits on the language that can be compiled are essentially those
imposed by its complexity and by the time available to write the
compiler.

The principal conclusion reached during the past few months is
that the language itself is much more involved and complex than
ever imagined at the beginning of the contract. It is likely that
a redefinition will become necessary in the future to permit any
enhancements to be built onto a stable and manageable base. The
first redefinition, or definition of a subset, will be the
de facto language specified by acceptability to the compiler.

Future research will, in all likelihood, be influenced by this
redefinition. Although the language was originally designed to
write verifiable code, especially that required for security
kernels and other "small" blocks of software, it has become very
powerful. The team writing KSOS asked for a subset of Euclid
which they believed to be adequate for their tasks (named Middle
Euclid), and although it was never made available to them (the
work involved being too great and the time needed much more than
expected), this definition survives. The first compiler, at the
end of April 1979, is likely to be capable of handling more than
Small Euclid but less than Middle Euclid.

It is expected that some effort will still be placed in the
rewriting of the security-sensitive portion of KSOS in whatever
Euclid is available. This is because verification efforts still
seem to be oriented towards some of the conditions which Euclid
imposes. The Euclid which is available, therefore, will
materially affect the way in which these portions of KSOS will
be coded.

An effort to permit verification of Euclid code seems to be a
desirable future path. As Euclid becomes more widely accepted,
the amount of code to be verified will surely increase, and
well-established techniques seem to be a requirement in that
case.

Appendix


In keeping with the last quarterly report, a table is presented
below to indicate, for each of the three languages, what
percentage of the language will be handled in the April compiler
by each of the 6 passes. The three languages are Small Euclid,
the language for bootstrapping the compiler itself; Middle
Euclid, the language originally requested by Ford Aerospace for
KSOS; and Full Euclid, the language described in the report of
the Committee.

The issues of verifiability are separate. The two aspects of
Euclid which enable it to produce verifiable code are:

(1) Access control (imports, exports, and aliasing);

(2) Assertions.

It is expected that by April there will be work leading to the
easy insertion of features into the compiler to permit (a) its
generation of some assertions (by Pass 4, the Conformance Pass)
and (b) its enforcing of imports and exports (by Pass 3, the
Builder). Neither feature is likely to be in the April compiler,
but both are moderately simple, if time-consuming, additions.
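
The two planned additions can be illustrated with a short sketch.
The Python below is not the compiler's code and does not
reproduce Euclid syntax. The function names (check_imports,
augment_with_assertions), the data formats, and the
"[generated]" marker are all assumptions made for illustration.
The first function shows the kind of check that enforcing imports
implies: every name used in a scope must be either declared there
or explicitly imported. The second shows one way an augmented
source stream could carry generated assertions alongside the
original lines.

```python
# Illustrative sketch only; names, formats, and the assertion text are
# assumptions and are not Euclid syntax or the project's implementation.

def check_imports(imported, declared_locally, used):
    """Return, sorted, the names used in a scope that are neither
    declared locally nor explicitly imported."""
    return sorted(set(used) - set(imported) - set(declared_locally))


def augment_with_assertions(source_lines, bounds_checks):
    """Return a new source stream with generated assertions inserted.

    `bounds_checks` maps a 0-based line index to (name, low, high).
    An assertion line is placed immediately before that source line,
    and the original lines are passed through unchanged.
    """
    augmented = []
    for index, line in enumerate(source_lines):
        if index in bounds_checks:
            name, low, high = bounds_checks[index]
            augmented.append(
                f"[generated] assert {low} <= {name} and {name} <= {high}"
            )
        augmented.append(line)
    return augmented
```

For example, a scope that imports only `a`, declares `t`, and
uses `a`, `t`, and `g` would have `g` reported by check_imports
as an access violation.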


Table 1

Percentage of language capabilities to be handled by each pass,
April 1979.


Language         Pass 1     2     3     4     5     6

Small Euclid        100   100   100   100   100   100

Middle Euclid       100   100    95    95    95    60

Full Euclid         100   100    95    95    95    50